Spotting scientific and technical specialization in biomedical documents using morphological clues

Authors

  • Jolanta Chmielik
  • Natalia Grabar
Abstract

Distinguishing the specialization level of health documents on the Internet is important, especially when the documents are read by non-expert users such as patients. Highly technical documents are hard for patients to understand, which may negatively affect their health care process and their communication with medical doctors. When medical portals offer such a distinction, it results from manual categorization. We propose an automatic categorization of health documents according to their specialization. We exploit morphological information obtained from the morphological analysis of lexemes. The evaluation shows that precision, recall and F-measure are often higher than 90%.

Keywords: medical documents, specialization, supervised learning, constructional morphology, semantics.

Similar resources

Connected Component Based Word Spotting on Persian Handwritten image documents

Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...

A Region-Based Hashing Approach for Symbol Spotting in Technical Documents

This paper presents a geometric hash function able to cluster similar regions, and its use for symbol spotting in technical documents. This hashing technique aims to perform a fast spotting process to find candidate locations, requiring neither a prior segmentation step nor a priori knowledge or a learning step.

Comparative Study between Expert and Non-Expert Biomedical Writings: Their Morphology and Semantics

The amount of health information on the Internet is constantly growing, but little is done to detect the technicality level of these documents and to guide users towards documents appropriate to their level of expertise. The objective of our work is to propose clues for the automatic distinction between expert and non-expert medical documents. More precisely, we propose to study the...

Tagging gene and protein names in biomedical text

MOTIVATION: The MEDLINE database of biomedical abstracts contains scientific knowledge about thousands of interacting genes and proteins. Automated text processing can aid in the comprehension and synthesis of this valuable information. The fundamental task of identifying gene and protein names is a necessary first step towards making full use of the information encoded in biomedical text. This ...

An Investigation on the Causes of a Rotor Bending and its Thermal Straightening (TECHNICAL NOTE)

Distortion or bend in a turbine rotor (especially HIP rotors) may be caused by a number of factors, either singly or in combination. In general, the causes of rotor bend can invariably be classified into two categories: rapidly forming permanent rotor bends and/or slower forming rotor bends, which could trip the turbines’ emergency stop. One of the major modifying solutions for rapid repairin...

Journal title:
  • TAL

Volume 52

Publication date: 2011